Although autonomous navigation is one of the most prolific applications of three-dimensional (3D) point
clouds and imagery, both techniques have many other potential applications. This work explores urban
digitization tools applied to 3D geometry to perform urban tasks. We focus exclusively on compiling
scientific research that merges mobile laser scanning (MLS) and imagery from vision systems. The major
contribution of this review is to show the evolution of MLS combined with imagery in urban applications.
We review systems used by public and private organizations to handle urban tasks such as historic
preservation, roadside assistance, road infrastructure inventory, and public space study. The work
pinpoints the potential and accuracy of data acquisition systems that handle both 3D point clouds and
imagery data. We highlight potential future work on detecting urban environment elements and solving
urban problems. This article concludes by discussing the major constraints and struggles of current
systems that use MLS combined with imagery to perform urban tasks and solve urban problems.
1. INTRODUCTION

Today’s accelerated urban growth requires essential changes in what we usually know as urban
construction, production, and management. In this sense, society’s efforts to adapt to and master the
urban revolution come along with remarkable and continuous technological progress. Public and private
organizations that regulate, inspect, and monitor public services have adopted technological tools for
specific tasks, such as urban digitization tools for map building in projects related to land, road, or
natural resource management. Additionally, areas such as topology, video game development, and historic
preservation are constantly searching for better technologies, sensors, and techniques to generate the
most precise digitization of urban elements.

This work explores urban digitization tools applied to three-dimensional (3D) geometry to perform
urban tasks. Nowadays, mobile laser scanning (MLS) systems on board moving vehicles are among the most
common tools for data acquisition in urban environments. These onboard systems combine long-range laser
sensors for 3D point cloud acquisition, panoramic cameras for image color and texture acquisition, and a
global positioning system (GPS) for location tracking. MLS systems can be helpful for a wide variety of
applications in areas such as archeology (e.g., land exploration), video game development (e.g., 3D object
reconstruction), topology, and historic preservation. Additionally, MLS data is used to evaluate buildings
(construction management and civil engineering) and streets (urban maintenance and electrical work).
Moreover, in urban tasks, MLS data is particularly useful for projects benefiting communities
(environmental projects).

Extensive research on MLS systems is reported in the literature, and each of these initiatives
introduces novel MLS developments from a different perspective. It is essential to compile data processing
methods in photogrammetry and remote sensing and to compare different registration techniques for
scattered 3D points [1]. In addition, Cheng et al. [2] briefly reviewed LiDAR technologies, including
terrestrial, aerial, and satellite laser scanning, across multiple applications.

MLS has proven to be an excellent tool for urban management applications such as building facade
reconstruction, road inventory, land exploration, and structural monitoring, to name a few. The works in
[3]-[6] examine different MLS applications to discuss technological advances in data registration,
geo-referencing of scanned data, environment change detection, and deformation monitoring for engineering
surveying and structural and civil engineering. Also, Matikainen et al. [7] present a review devoted
exclusively to remote sensing methods for power line corridors, exploring different LiDAR technologies.
Studies such as [8]-[10] describe and compare urban object recognition and classification methodologies
using MLS.

Some years ago, urban data acquisition systems consisted only of laser scanner devices, which replaced
vision systems as the primary tool for handling urban tasks. However, it did not take long for vision
systems to be integrated into the solution again. Including camera vision systems in urban data
acquisition systems adds texture and color to laser scanning data. In this sense, the works in [11], [12]
examine multiple procedures for 3D reconstruction and 3D modeling from LiDAR and image integration for
visualization and aesthetics. Similarly, Ma et al. [13] studied LiDAR-based mobile mapping and surveying
technology by analyzing the performance of outstanding mobile terrestrial laser scanning systems.
Additionally, the authors reviewed the positioning, scanning, and imaging devices integrated into these
systems. Likewise, works such as [14]-[16] compare mobile LiDAR technologies, including system
components. More recently, Wang et al. [17] reviewed MLS systems for urban 3D modeling and checked the
efficiency and stability of these systems. The main applications are 3D modeling, LiDAR simultaneous
localization and mapping, point cloud registration, feature and object extraction, semantic segmentation,
and processing with deep learning. In addition, Gao et al. [18] present a 3D LiDAR dataset to review size,
diversity, and quality, which are the critical factors in training deep models. They also provide an
organized survey of 3D semantic segmentation, including the latest research trends in deep learning
techniques.

The combination of MLS data with imagery in vision systems helps retrieve more details of urban 
objects. Consequently, MLS initiatives combining imagery may be classified depending on the object to be 
detected, such as road markings, road signs, and pole-like objects. The works reviewed in this paper introduce 
MLS developments and discuss the performance and applicability of such developments to demonstrate that 
MLS systems are suitable for different urban management applications. 

This work exclusively compiles scientific research that merges MLS data and imagery from vision
systems. A similar work is [19], whose authors conducted a study on photogrammetry and remote sensing to
point out the benefits of using both types of data so that each compensates for the other’s disadvantages.
More recently, Zhong et al. [20] surveyed the fusion and enhancement of MLS and camera systems. That work
reviews both sensors regarding depth completion, 3D object detection, segmentation, and tracking. In our
review, the main interest is to know how current initiatives for handling urban management tasks
successfully integrate MLS with imagery, since we found a gap regarding the use of 3D point cloud data
along with imagery in urban applications.

The main contribution of this review is to trace the evolution of MLS with imagery technologies ap- 
plied in urban applications. We do not review, however, autonomous vehicle MLS systems, as we consider they 
deserve a study of their own. We review applications that can be used by both public and private organizations 
in urban environments such as construction management, urban maintenance, environmental projects, electri- 
cal work, and civil engineering for management tasks such as historic preservation, roadside assistance, road 
infrastructure inventory, and public space study. Hence, this paper pinpoints the potential of data acquisition 
systems that use both 3D point clouds and imagery data to detect urban elements and solve urban problems. 
We include a section describing MLS and vision system technology to provide an overview of the tools
used to accomplish such urban tasks. Namely, we focus on listing MLS developments that show data
processing, segmentation, detection, and classification capabilities. Likewise, we discuss the methods
commonly used to evaluate the performance of such systems as well as the accuracy of their results. Then,
in the discussion section, we explain where merging MLS data with imagery from vision systems is used
most and why. Finally, in our conclusions, we highlight opportunity areas for urban tasks in which MLS
data and imagery data are exploited for both academic and industrial purposes.


2. RESEARCH METHOD 

Our review discusses only scientific research that integrates MLS data with imagery from vision
systems for urban tasks. The selected works meet the following criteria: i) Data acquisition systems with
at least the following components: an MLS system, panoramic cameras, and a GPS receiver; ii)
Implementation of these systems by either public or private organizations in urban environment
applications such as construction management, urban maintenance, environmental projects, electrical work,
or civil engineering; iii) Works discussing the data processing and describing the object detection or
segmentation method of the data acquisition system; iv) If the work discusses a data acquisition system
with object classification capabilities, then the classification and evaluation techniques used must be
described along with the accuracy of the system’s results; and v) The main keywords for paper selection
are the following: LiDAR, 3D point cloud, vision system, mobile mapping systems, mobile laser scanner
systems, panoramic cameras, GPS receiver, imagery, urban application, urban management, urban environment,
urban object detection, urban object segmentation, urban object classification, buildings, trees, road
signs, road elements, and road markings.

Our search covers approximately ten years of research to trace the evolution of data acquisition in
urban contexts. We focus exclusively on compiling scientific research that merges MLS data with imagery
data from vision systems. We leave aside data acquisition developments mounted on autonomous vehicles,
since we consider they deserve a study of their own. We also consulted the datasheets of every system
reviewed. The search was conducted with tools such as Google Scholar, ResearchGate, Scopus, Mendeley, and
multiple journal search engines. In the last ten years, there has been a remarkable increase in urban
digitization developments. Figure 1 depicts the yearly distribution of works published on
state-of-the-art technologies merging MLS data with vision system images in the context of urban
projects. The dotted trend line clearly shows that the number of works is increasing. As cities around the
world continue to grow, these developments will remain on the rise to help societies overcome emerging
urban challenges and master the latest urban revolution. Figure 2 shows the count of the principal LiDAR
sensors; some works do not specify the sensor.


Figure 1. State-of-the-art scientific article count per year (number of works, 2010 to 2021) that merges
both MLS data and vision system images. The dotted line shows the trend line

Figure 2. Principal LiDAR sensors count; some works do not specify the sensor


3. MOBILE DATA ACQUISITION SYSTEMS 

The past couple of decades have witnessed the exponential growth of computation and data management
technologies, leading to rapid developments in computational technology, especially in terms of computer
vision and computer graphics. In this sense, mobile mapping systems (MMS), especially MLS systems, have
proven to be effective tools in a wide range of urban applications. The satisfactory performance of MLS
systems is due to their sensory system, which provides dynamic vision to any vehicle using the following
components: i) A laser scanner sensor: range sensor, LiDAR sensor; ii) Vision sensors: digital cameras,
video cameras, and panoramic cameras; and iii) A global navigation satellite system (GNSS): GPS receiver,
inertial measurement unit (IMU), odometers, speedometers, and accelerometers.

LiDAR sensors are the main element of mobile data acquisition systems, whether MLS or MMS. In turn,
each MLS and MMS platform has different applications. In this work, we only review vehicle-mounted MLS
systems and MMS that use both 3D point clouds and imagery for scanning in urban applications. As
previously mentioned, we discard platforms used for autonomous driving since we consider that these
systems deserve their own study. Our goal is to describe the technologies in the MLS system and the
vision system. Table 1 summarizes our review of vehicle-mounted MMS and MLS systems by including the
following information: i) Name of the system or the institution that developed or integrated the system;
ii) The model of the laser scanner sensor; iii) The model of the vision sensor, including the field of
view (FOV) and the resolution; iv) The model of the GNSS/IMU receiver system; and v) The authors for
reference. Also, we briefly describe how each system merged the MLS data with imagery data and which
urban tasks were covered. Additionally, we introduce Table 2 to highlight the characteristics of the
laser sensors included in each reviewed system, along with aspects such as data update rate, FOV, data
acquisition range, and image resolution. We also consulted the datasheets of each system to obtain more
details on its characteristics.

The first platform reviewed is the Finnish Geodetic Institute (FGI) Sensei, developed at the Finnish
Geodetic Institute. According to Jaakkola et al. [21], FGI Sensei is a modular measurement system capable
of performing aerial scanning and MLS mapping. The system uses an Ibeo Lux laser scanner for individual
tree measurements. The researchers used FGI Sensei to classify tree species by integrating laser and
hyperspectral data. The camera in this system provides synchronized RGB point cloud data and images to
identify urban objects and produce a mapped overview of the environment [22]. FGI Sensei also uses a
spectrometer that measures incoming light by passing it through a diffraction grating onto a
monochromatic charge-coupled device (CCD) sensor. For data acquisition, the spectral channels were
averaged by binning the pixels on the CCD sensor into 123 channels [22].
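
The channel averaging described above can be illustrated with a minimal sketch. It assumes a single row of
CCD pixel readings and near-equal bin widths; the function name `bin_channels` and the sample values are
hypothetical and are not taken from [22].

```python
def bin_channels(pixels, n_channels):
    """Average a row of CCD pixel readings into n_channels spectral channels."""
    if n_channels <= 0 or n_channels > len(pixels):
        raise ValueError("n_channels must be between 1 and len(pixels)")
    binned = []
    for i in range(n_channels):
        # Integer bin edges spread the pixels as evenly as possible.
        start = i * len(pixels) // n_channels
        end = (i + 1) * len(pixels) // n_channels
        chunk = pixels[start:end]
        binned.append(sum(chunk) / len(chunk))
    return binned


# Hypothetical 8-pixel row reduced to 4 channels (the reviewed system uses 123).
example = bin_channels([1, 3, 5, 7, 9, 11, 13, 15], 4)
```

Averaging adjacent pixels trades spectral resolution for a better signal level per channel, which is one
plausible reason for binning before classification.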

The second platform reviewed is the VIAMETRIS MMS, a vehicle for road surveying developed by [23].
The system can extract road inventory (markings and signs) using reflectance feature information from
four SICK LMS-291 LiDAR sensors. In turn, camera data help texturize and interpret this information to
show the results in a geographic information system (GIS) that can be consulted through internal software.
System name or institution | Laser scanner sensor | Vision sensor, resolution, and field of view (FOV, H x V) | GNSS/IMU receiver systems | References
--- | --- | --- | --- | ---
FGI Sensei | Ibeo Lux | AVT Pike F-421C 4 MP 2048 x 2048; Specim V10H 44.4° V | NovAtel SPAN-CPT; NovAtel 702 GG | [21], [22]
VIAMETRIS | Four SICK LMS-291 | AVT Pike F-210C 2.07 MP 1920 x 1080 | Trimble Omnistar 8200-H; Ixsea LandINS | [23]
eXperimental Platform (XP-1) | RIEGL VQ-250 | FLIR thermal SC-660 3.2 MP 640 x 480; 5-CCD multispectral camera | Ixsea LandINS | [24], [25]
East China Normal University VBLS | Two SICK n/s | Two CCD cameras 1392 x 1040 | GPS, INS, and odometer n/s | [26]
NAVTEQ True | Velodyne HDL-64E | Ladybug 3 1600 x 1200; HD Prosilica cameras n/s | GPS/IMU/DMI n/s | [27], [28]
National Polytechnic Institute | Velodyne HDL-64E | Point Grey Ladybug2 1024 x 768 | ProMark 3 GPS; CHR UM6 gyro | [29]-[31]
VISIMIND | Leica HDS4500 | Six 2-4 MP SONY digital cameras | Imar GmbH; GPS/GLONASS Topcon | [32]
Stereopolis II | Two RIEGL LMSQ120i; Velodyne HDL-64E | Sixteen Pike full HD cameras n/s | POS-LV220 | [33]
Topcon IP-S2 | Three 2D SICK n/s | 360° six digital cameras 1600 x 1200 | GPS/GLONASS and IMU signals tracker n/s | [34]
Topcon IP-S3 HD1 | Velodyne HDL-32E | 360° Ladybug 5 30 MP (5 MP x six sensors) 2048 x 2448 | GPS/GLONASS and IMU signals tracker n/s | [35]
Optech Lynx | Two Lynx sensors | Four Lynx BB-500 GE 5 MP, 57° x 47° FOV | POS LV 520 Applanix | [36]-[40]
SSW | 360° laser sensor n/s | Ladybug 3 1600 x 1200 | GPS receiver and IMU n/s | [41], [42]
RIEGL VMX 250 | Two RIEGL VQ-250 | Six VMX-250-CS6 1.4 MP, 2 MP, or 5 MP | IMU/GNSS n/s | [43]
RIEGL VMX 450 | Two RIEGL VQ-450 | VMX-450-CS6 5 MP 2452 x 2056, 80° x 65° FOV; Ladybug5 30 MP (5 MP x six sensors) 2048 x 2448 | IMU/GNSS n/s | [42]-[49]
Trimble MX-8 | Two RIEGL VQ-250 | Four Point Grey Grasshopper GRAS-50S5C 5 MP 2448 x 2048 | Applanix POS LV 520 | [50]
Florida Atlantic University | Velodyne HDL-32E | Cameras: Nikon 3200, 3300 24.4 MP 6045 x 4003 | Geodetics Geo-iNav | [51]
Purdue University | Two Velodyne HDL-32E | 5 MP FLIR Flea-2G camera | Novatel SPAN-CPT | [52]
Utah Department of Transportation (UDOT) | Velodyne HDL-32E; laser road imaging system n/s | Imaging technologies n/s | Laser rut measure system n/s; position orientation system n/s | [53], [54]
Wuhan University | Three low-cost SICK n/s | Ladybug 3 1600 x 1200 | GPS/IMU n/s | [55]
StreetMapper 360 | RIEGL VQ-250 or VQ-450 | DigiCAM K14 or Nikon D300 SLR 12.3 MP 4288 x 2848 | NovAtel OEMV-3 or high-quality IGI navigation system | [56]-[58]
Tsinghua University | Velodyne HDL-64E | Basler digital camera 1292 x 964 | n/s | [59]
MODISSA | Two Velodyne HDL-64E; two Velodyne VLP-16 | Eight Baumer VLG-20C.1; Jenoptik IR-TCM 640 thermal infrared camera; JAI CM 200-MCL gray scale camera; Jenoptik DLEM 20 laser rangefinder | Applanix POS LV V5 520 INS/GNSS DMI/IMU | [60]
AnnieWAY | Velodyne HDL-64E | Two FL2-14S3M-C and two FL2-14S3C-C Point Grey Flea 2 | OXTS RT 3003 | [61]
Innopolis University | Velodyne VLP-16 | Basler acA 1300-200uc | NV08C; MTi-G-710 | [62]
KAIST, Daejeon, South Korea | Two Velodyne VLP-16; two SICK LMS-511 | Two FLIR FL3-U3-20E4C-C 1280 x 560 | EVK-7P u-blox; GRX 2 SOKKIA; MTi-300 Xsens; LM13 RLS | [63]
Teledyne Optech Maverick | 32-line LiDAR sensor | Ladybug 5 panoramic camera | GNSS system | [64]

*n/s: not specified
Table 2. Different laser scanner sensors in mobile data acquisition systems

Laser scanner sensor | Company name | Data update rate | Field of view (FOV, H x V) | Acquisition range | Resolution (points/second) | Energy consumption
--- | --- | --- | --- | --- | --- | ---
Ibeo Lux | Ibeo Automotive Systems GmbH | 50 Hz | 110° x 3.2° | 0.3 to 200 m | 38,000 | 9-27 V, 8 W (average), < 10 W (max)
Leica HDS4500 | Leica Geosystems | 10 to 20 Hz | 360° x 310° | 1 m to 25 m | Up to 500,000 | 24 V, 50-70 W
Lynx | Optech Lidar Imaging Solutions | 500 kHz | 360° H | 200 m max | 1,000,000 | 12 V, 30 A
RIEGL LMSQ120i | RIEGL Laser Measurement Systems | 30 kHz | 80° H | Up to 150 m | 10,000 | 24 V, 2 A
RIEGL VQ-250 | RIEGL Laser Measurement Systems | 300 kHz | 360° H | Up to 500 m | 100 | 18-32 V, 180 W (max)
RIEGL VQ-450 | RIEGL Laser Measurement Systems | 550 kHz | 360° H | Up to 800 m | 200 | 18-32 V, 180 W (max)
SICK LMS-291 | SICK Sensor Intelligence | 75 Hz | 180° H | 80 m | 361 (infer) | 24 V, 30 W
SICK LMS-511 | SICK Sensor Intelligence | 100 Hz | 190° H | 80 m | 361 (infer) | 24 V, 22 W
Teledyne Optech Maverick | Teledyne Optech | n/s | 360° H, +10° to -30° V | Up to 100 m | Up to 700,000 | 12-36 V
Velodyne HDL-64E | Velodyne LiDAR | 5 to 20 Hz | 360° x 26.9° | Up to 120 m | Up to 2,200,000 | 12-32 V, 60 W
Velodyne HDL-32E | Velodyne LiDAR | 5 to 20 Hz | 360° x 41.33° | Up to 100 m | Up to 695,000 | 9-18 V, 12 W
Velodyne VLP-16 | Velodyne LiDAR | 5 to 20 Hz | 360° x 30° | 100 m | 300,000 | 9-18 V, 8 W

n/s: not specified


The third system is the eXperimental Platform (XP-1), an MMS designed and implemented by [24] at
Maynooth University. The innovative 5-CCD multispectral camera in XP-1 can sense across visible and
infrared bands. Kumar et al. [25] implemented binary morphological operations, used generic road marking
dimension data, and eliminated extra road elements by thresholding.

East China Normal University developed a SICK-based vehicle-borne laser scanning (VBLS) system [26],
the fourth platform reviewed. The goal of the system was to extract street lamp distance data, and it
proved valid in terms of the accuracy of positioning and modeling ground targets. The method in the
SICK/VBLS system first finds the nearest shooting position of each image record. Then, the system
calculates the distance between each laser point and each imaging position to discard laser points that
lie beyond a distance threshold from the shooting position.
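
The distance test described above can be sketched as follows. This is only one possible reading of the
procedure, assuming that points and shooting positions share one Cartesian frame; the function names, the
sample coordinates, and the threshold value are hypothetical and do not come from [26].

```python
import math


def nearest_position(point, positions):
    """Return the camera shooting position closest to a laser point."""
    return min(positions, key=lambda p: math.dist(point, p))


def filter_points(points, positions, max_distance):
    """Keep laser points within max_distance of their nearest shooting position."""
    kept = []
    for point in points:
        nearest = nearest_position(point, positions)
        if math.dist(point, nearest) <= max_distance:
            # Store the point together with the position it is assigned to.
            kept.append((point, nearest))
    return kept


# Hypothetical data: two shooting positions and three laser points (meters).
positions = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
points = [(1.0, 0.0, 0.0), (9.0, 1.0, 0.0), (5.0, 20.0, 0.0)]
kept = filter_points(points, positions, max_distance=3.0)
```

In this toy case the third point lies about 20.6 m from both positions and is discarded, while the first
two are kept and paired with their nearest image.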

Next, Chen et al. [27] and Babahajiani et al. [28] implemented the NAVTEQ True MMS for urban
applications, the fifth platform reviewed. On the one hand, Chen et al. [27] relied on the system to focus
on facade-aligned and viewpoint-aligned street-level image data to improve city-scale reconstruction. The
authors employed panoramic cameras to construct a visual database of omnidirectional images and query
images captured using traditional perspective cameras. On the other hand, Babahajiani et al. [28] used
NAVTEQ True technology to develop a street scene semantic recognition framework by labeling datasets. The
system establishes correspondences between 3D points and combinations of 2D image pixels.

The sixth platform is from the National Polytechnic Institute of Mexico (IPN, by its Spanish acronym).
The institute assembled its own MLS system for 3D urban reconstruction and conducted a sensitivity
analysis of the calibration of the laser sensor and the panoramic camera [29]-[31]. The accuracy of
texture extraction is a function of the distance between sensors. Each 3D point is projected onto the
panoramic image and classified according to its distance from the camera. From a similar perspective,
Garcia-Moreno et al. [29] developed an automated 3D city reconstruction platform for geo-referenced 3D
reconstruction of outdoor scenes. According to the researchers, the system can generate global textured
models with a good visual appearance while preserving the geometry of the scanned scenes by using
uncertainty and sensitivity evaluation information.
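
Projecting a 3D point onto a panoramic image and classifying it by range can be sketched as below. The
sketch assumes an ideal equirectangular panorama, a point already expressed in the camera frame (x
forward, y left, z up), and arbitrary range thresholds; none of these choices is confirmed by [29]-[31],
and a real system would first apply the calibrated LiDAR-to-camera transformation.

```python
import math


def project_to_panorama(point, width, height):
    """Map a 3D point in the camera frame to (u, v) pixels and its range.

    Assumes an equirectangular panorama with x forward, y left, z up.
    """
    x, y, z = point
    rng = math.sqrt(x * x + y * y + z * z)
    if rng == 0.0:
        raise ValueError("point coincides with the camera center")
    azimuth = math.atan2(y, x)       # -pi..pi, 0 is straight ahead
    elevation = math.asin(z / rng)   # -pi/2..pi/2, 0 is the horizon
    # The image center corresponds to azimuth 0 and elevation 0.
    u = ((0.5 - azimuth / (2.0 * math.pi)) * width) % width
    v = (0.5 - elevation / math.pi) * height
    return u, v, rng


def distance_class(rng, near=10.0, far=30.0):
    """Label a point by its range; the thresholds are illustrative only."""
    if rng < near:
        return "near"
    if rng < far:
        return "mid"
    return "far"


# Hypothetical point 5 m straight ahead on a 1024 x 512 panorama.
u, v, rng = project_to_panorama((5.0, 0.0, 0.0), 1024, 512)
label = distance_class(rng)
```

The pixel at (u, v) would supply the color for the 3D point, and the range label could be used to weight
or reject textures taken from far away.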

In collaboration with KTH Royal Institute of Technology, VISIMIND developed the VISIMIND MMS [32],
the seventh platform reviewed, a system that uses imagery and laser data to obtain a geo-referenced
inertial navigation system (INS). In this approach, imagery and laser data help determine object position
and attitude to back up IMU navigation. The eighth platform reviewed, Stereopolis II, was developed by
[33] and emerged as a hybrid image/laser MMS. The system can capture a spatial data infrastructure
compatible with several web applications, from immersive multimedia visualization to 3D metrology.
Additionally, the street view application in Stereopolis II displays a pedestrian view of streets and map
interaction to update precise urban maps.
The ninth and tenth platforms reviewed are the IP-S2 and IP-S3 mobile survey systems from Topcon.
The IP-S2 comprises a camera, a LiDAR sensor, and a GNSS system. The system was created by [34] and allows
for road element extraction and georeferencing through 3D-to-2D re-projection of data and image
processing. Likewise, the system has an automatic process for object extraction from co-registered data.
The IP-S2’s counterpart, the IP-S3 HD1 MMS, was developed by the same company and can acquire overlapping
colored images and dense point clouds. In their work, Hussnain et al. [35] implemented a feature
detection, extraction, and matching method using the two Topcon MMSs along with aerial orthoimages.

The eleventh platform reviewed is the Lynx MMS from Optech. The Lynx can generate LiDAR and image
data and has been used in [36]-[40]. Puente et al. [36] used the Lynx MMS and obtained thresholds to
classify eight-bit color histograms (RGB 0,0,0 for black to 255,255,255 for white), whereas Riveiro et
al. [37] identified road elements by using point cloud intensity data; their method extracts road
elements by thresholding intensity images. In another work, Soilan et al. [38] used the Lynx MMS to
re-project the 3D positions of synchronized road element point clouds onto 2D pixels. Then, they used
machine learning to identify the road elements. More recently, Safaie et al. [40] developed an automated
tree inventory based on the Hough transform and active contours. Even though they assemble their own
binary images, Google images are used to compare the results. In Arcos-Garcia et al. [39], the MMS
detects traffic signs using the retro-reflective paint feature from RGB projected images. The work
applies a deep artificial neural network (D-ANN) with convolutional and spatial transformer layers to
extract traffic signs.

Capital Normal University and the Beijing Geo-Vision Technology Limited Liability Company jointly
developed the vehicle-borne scanning system SSW MMS [41], [42]. The main data retrieved by this twelfth
system are point clouds acquired by the laser scanner, while camera images provide texture for object
reconstruction and detection. In Yang et al. [42], color and intensity features are used to create
multi-scale super-voxels, while in [41], textural information from cameras complements the 3D point
clouds for 3D urban environment model reconstruction.

RIEGL laser measurement systems comprise 2D and 3D laser scanners suitable for mobile mapping
applications. The thirteenth and fourteenth platforms reviewed are the RIEGL VMX 250 and RIEGL VMX 450,
which can register scanned data acquired from moving platforms. Landa and Prochazka [43] relied on the
RIEGL VMX 250 system for sign detection by reflectivity filtering. Since road signs are coated with
highly reflective paint, they stand out in the laser data, and the RGB images supplied the color of these
signs. Compared with the RIEGL VMX 250 system, the RIEGL VMX 450 has been used in a larger number of
applications [42], [44]-[48]. In Yang et al. [42], the RIEGL VMX 450 system was used to generate
multi-scale super-voxels, where point attributes such as colors, intensities, and spatial length form the
super-voxel. The goal is to generate a graph-based segmentation using multiple cues such as main direction
and colors. In Wu et al. [44], the RIEGL VMX 450 was used to develop a novel method for traffic sign
detection and visibility evaluation. The method relies on the high retro-reflectivity of traffic signs
and a visibility estimation method. In Yu et al. [45], the RIEGL VMX 450 system helped retrieve a
collection of usual traffic sign pictograms and stamps from the Chinese Ministry of Transport. Then, the
researchers used a Gaussian-Bernoulli deep Boltzmann machine (GDBM) to represent these signs and reduced
the size of the stamps to an 80 x 80-px square.

In Wen et al. [46], the RIEGL VMX 450 system was used to develop a spatial-related traffic sign
management procedure. The sign area is extracted from the point clouds. Then, the 2D image data is
re-projected onto the point cloud data for sign recognition. From a different perspective, You et al. [47]
introduced a traffic sign identification and fast deterioration examination method for typical
environments. The approach uses a deep neural network (DNN) and Fast R-CNN.

Finally, Guan et al. [48] and Guan et al. [49] relied on the RIEGL VMX 450 to propose a method for
detecting traffic signs directly from mobile LiDAR point clouds based on prior knowledge of aspects such
as road width, pole height, material reflectance, geometrical structure, and traffic sign size.
Additionally, the system segments traffic sign images by projecting the detected traffic sign points onto
the digital images. More recently, the authors employed a convolutional capsule network model for
classification.

Guan et al. [50] tested the performance of the Trimble MX-8, a commercial MLS system that generates
rich survey-grade laser and image data for urban surveying. This fifteenth system reviewed was tested at
two urban sites for road network update and management tasks. The Trimble MX-8 proved efficient at
extracting digital ground models, measuring tunnel height and road width, identifying traffic signs,
reconstructing 3D building models, monitoring landslides, and configuring utilities.

Next are the sixteenth and seventeenth platforms reviewed: in [51] and [52], Velodyne HDL-32E-based
systems were used to improve the accuracy of point cloud and imagery recordings. The eighteenth platform
is described in [54], which discusses a mobile data collection approach developed by the Utah Department
of Transportation (UDOT). The system comprises a laser scanner, an imaging sensor, a distance and crack
measurement module, and other components. The goal of the UDOT approach is to obtain high-resolution road
sign images. First, daytime digital photos of road signs were captured. Then, trained operators examined
these photos to rate the visual condition of the signs as good, fair, or poor (GFP). From a different
perspective, in collaboration with Wuhan LEADOR Spatial Information Technology Co., Wuhan University
developed its own MMS. In Cui et al. [55], this nineteenth system is used to propose a line-based
registration approach for panoramic images and LiDAR point clouds. The researchers established the
transformation model between the primitives from the two datasets in the camera-centered coordinate
system. Also, using extracted features, they resolved the relative orientations and translations between
the camera and the LiDAR.

The twentieth platform is the StreetMapper 360 MMS, which is mainly used for road mapping and urban
environment reconstruction. In Yadav and Chousalkar [56], this MMS is part of a power line extraction
method. The acquired point cloud is first organized as 2D gridded data, which take the shape of connected
pillars in 3D. The 2D Hough transform was then applied to the image data to detect power lines as linear
features. In Yadav et al. [57], the StreetMapper 360 MMS is used within a method for calculating road
geometry parameters (i.e., width, centerline, longitudinal slope, and cross slope). On-site manual
measurements served as reference data to verify the correct functioning of the method. The data included
road slope diagrams derived from MLS data (XYZRGB format) of road surface points.
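
The first step of the power line method, organizing the point cloud as 2D gridded data, can be sketched
as below. The sketch covers only the gridding, not the Hough transform, and it assumes square cells on
the XY plane; the cell size, the function names, and the sample points are hypothetical and are not taken
from [56].

```python
import math
from collections import defaultdict


def grid_points(points, cell_size):
    """Group (x, y, z) points by the 2D grid cell that contains their (x, y)."""
    if cell_size <= 0:
        raise ValueError("cell_size must be positive")
    cells = defaultdict(list)
    for x, y, z in points:
        key = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[key].append((x, y, z))
    return dict(cells)


def pillar_heights(cells):
    """Return the vertical extent (max z minus min z) of each occupied cell."""
    return {
        key: max(p[2] for p in pts) - min(p[2] for p in pts)
        for key, pts in cells.items()
    }


# Hypothetical points: a tall pillar in cell (0, 0) and two ground points.
cloud = [(0.2, 0.3, 0.0), (0.4, 0.1, 8.0), (1.5, 0.2, 0.1), (-0.1, 0.0, 0.5)]
cells = grid_points(cloud, cell_size=1.0)
heights = pillar_heights(cells)
```

Each occupied cell acts as a vertical pillar, so tall structures such as poles can be separated from the
ground by their vertical extent before any line detection is attempted.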

The twenty-first platform is described in [59], where researchers from Tsinghua University used a
Velodyne HDL-64E system with a Basler digital camera (1292 x 964 resolution) to acquire both color and
geometrical data. Planar objects are detected directly in 3D space from the colorized laser scans. The
authors applied a driving cuboid aligned along roadway boundaries, and the laser scans falling into the
camera FOV collect color data.

The Fraunhofer Institute of Optronics developed the sensor vehicle MODISSA. This twenty-second
platform reviewed allows the development and testing of real-time methods or high-level driver assistance
functions. These functionalities were applied in LiDAR-camera pedestrian detection methods [60]. Zhu et
al. [65] used MODISSA to generate a unified thermal point cloud without the need for RGB images. The
fusion helps to describe the radiance of building facades and to analyze thermal properties.

The Karlsruhe Institute of Technology and the Toyota Technological Institute at Chicago (TTIC) de- 
veloped the AnnieWAY MLS system and used this twenty-third platform reviewed to develop a novel set of 
computer vision benchmarks, known as the KITTI vision benchmark. The tasks addressed by the KITTI suite
include stereo, optical flow, visual odometry, 3D object detection, and 3D tracking. Bruno et
al. [61] applied the KITTI benchmark suite to their 3D traffic sign detection method to track traffic sign objects 
and their images. The method can identify traffic signs by integrating 2D and 3D data and building a semantic 
object interpretation. 

Velodyne’s VLP-16 sensor is the smallest advanced sensor in Velodyne’s 3D LiDAR product range.
Buyval et al. [62] proposed a method on board an autonomous vehicle for road sign detection and localization
using the VLP-16 sensor. The researchers used this twenty-fourth platform reviewed to implement their
algorithm for road sign classification and localization in 3D space. The algorithm uses neural networks
and point clouds obtained from a laser range finder. From a different perspective, the Korea Advanced Institute
of Science and Technology (KAIST) developed its own MMS. Jeong et al. [63] incorporated stereo camera
data into the KAIST MMS, the twenty-fifth platform reviewed, to support vision-based robotics research. As its
main contribution, this approach provides data for a variety of environments, from downtown areas to apartment
complexes to underground parking lots. The approach also provides a baseline via a SLAM algorithm using
highly accurate navigational sensors and a semi-automatic loop closure process.

The Toronto-3D data set (Weikai [64]) was acquired with the Teledyne Optech Maverick, the twenty-sixth
platform reviewed. The data set covers approximately 1 km of point clouds and consists of about 78.3
million points. It was designed for semantic segmentation, so that deep learning models can be trained
effectively. The data set is labeled with eight urban object categories: road, road marking, natural, building,
utility line, pole, car, and fence.

According to this review, RIEGL VMX-450, Optech Lynx Mobile Mapper, and Velodyne HDL-64
are the most popular MLS tools applied in urban tasks. RIEGL VMX-450 is a robust integrated system with
its own sensors; Optech Lynx Mobile Mapper is less robust than RIEGL but includes both digital
cameras and GPS receivers. In turn, the Velodyne HDL-64 is robust but comprises only a laser scanner
sensor. In conclusion, the choice of a given MLS system over others depends on multiple factors (e.g. costs),
yet integrated MLS systems are more appropriate because they include GPS receivers and cameras, which
allow simultaneous registration of visual and spatial data. Nevertheless, it is essential to maintain a
cost-benefit balance. One of the most common problems in the use of MLS systems concerns large occluded
regions and segmentation. A relatively inexpensive solution to occlusion is to retrieve multiple scanned
images of the same area; however, the development of segmentation methods is a case study in itself.


4. URBAN MANAGEMENT APPLICATIONS 

This section discusses the evolution of MLS technology combined with imagery for urban applica- 
tions. We list those systems using data processing, segmentation, detection, and classification methods. We 
also discuss the accuracy of the results provided by these systems and the evaluation methods used to assess 
their performance. Objects in urban environments include buildings, pedestrians, vehicles, road signs, trees, 
and animals, among others. Dynamic objects (e.g. vehicles, pedestrians, animals) are studied mainly for ob- 
stacle avoidance by autonomous vehicles or to analyze the flow of objects in physical space. On the other 
hand, static objects (e.g. trees, road signs, traffic lights, lamps, buildings, streets, and services) are studied to 
ensure their good condition and location and trace their change over time due to weather conditions or vandal- 
ism. Also, static object features serve multiple different purposes (e.g. safe transit, environment preservation, 
historic preservation, public space study, archaeological or architectural analyses). 

The initiatives reviewed in this section are organized in four tables. Each table includes
the following information: i) author names, ii) acquisition system (see section 3), iii) urban object classes
detected, iv) data processing methods for 3D point clouds and imagery enhancement or reduction, v) detection and
segmentation methods implemented, vi) classification methods used, and vii) accuracy, performance, and
results obtained. Table 3 summarizes our review of works dealing with mixed (i.e. dynamic and static) object
detection. Then, Table 4 lists the reviewed developments on static objects (i.e. trees, road signs, traffic lights,
lamps, buildings, streets, and services). Notice that studies on road signs are listed separately in Table 5 due
to the large number of scientific developments revolving around road sign detection and visualization. Finally,
developments dealing with road elements (e.g. road markings, crosswalks, lanes, arrow signs, sidewalks) are
summarized in Table 6. The following sections provide a thorough discussion of each table.


4.1. Urban management applications from mixed dynamic and static urban objects 

Table 3 introduces a compilation of works on static and dynamic urban object detection and classification.
Detection of mixed urban objects is applied in scene analysis, roadside assistance, or obstacle avoidance;
vehicles, people, and urban elements such as trees are the most commonly detected. Urban object
detection supports urban surveying, geospatial data acquisition by GPS, and safety assessment in
pedestrian crossing environments. There is scientific evidence that mobile mapping is a reasonable means for
analyzing specific safety parameters in urban environments.

The first step in data processing is plane extraction. Segmentation of ground and facade points reduces the
amount of information to be analyzed and facilitates problem-solving. As researchers [66]-[69] pointed out,
methods such as the random sample consensus (RANSAC) and the M-estimator sample consensus (MSAC) are
popular because they are easier to implement than other methods, such as gridding and voxelization. These last
two techniques group 3D points into perceptually meaningful clusters with high efficiency. Also,
supervoxelization clusters spatially connected points with similar features [66], [42], [67], [68].
Supervoxelization is often preferred over other segmentation methods because supervoxels, instead of the
original points, become the basic processing units in point cloud applications.

Object detection and segmentation depend on the classes to be identified. If the goal is building
and road reconstruction, data processing already achieves object segmentation. For other objects, such as dynamic
and static urban objects, feature vectors are usually favored. Feature vectors include geometric and appearance
features (dimensions, color, and intensity) [28], [70], [66], [71]-[73], [69]. The connected component algorithm
is commonly used for segmentation, since it operates on organized point cloud data [71], [74], [75], [69]. The
algorithm relies on a costly neighborhood search and uses the surface normals of points in Euclidean space, as is
common in point clouds. Points are compared using a comparison function to determine their neighborhood in the
point cloud.

Since most works on urban objects aim to discern among classes, the classification process only needs
to find general classes. The main classifiers include the support vector machine (SVM) [74],
[71], [70], fuzzy logic [66], boosted decision trees [28], [67], [68], and convolutional neural networks (CNN)
[75], [73], [69]. All of these are machine learning techniques, whose objective is to acquire knowledge by
training on a given data set. Accuracy validation methods include the confusion matrix
and the F-score [66], [75], [72]. The F1-score is the harmonic mean of precision and recall; it reaches
its best value at 1 (perfect precision and recall) and its worst value at 0. Results have shown that
accuracy decreases as more object details are to be detected. Serna and Marcotegui [74] studied eight object
classes: cars, pedestrians, noisy structures, dogs, house facades, chimneys, trees, and lampposts. The detection
method segmented 78% of the objects accurately, and 82% of such well-segmented objects were correctly
classified using SVM and connected components. Babahajiani et al. [68] proposed a method that classifies eight
classes of urban objects (buildings, trees, cars, traffic signs, pedestrians, roads, water, and sky) using
a boosted decision tree detector. The method had a classification accuracy ranging from 67% to 83%. Also,
Luo et al. [73] used three data sets with nine, six, and 14 classes, respectively. They implemented a CNN with
feature vectors of the objects. The lowest accuracy (74.9%) was obtained on the 14-class data set.
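
To make the plane extraction step concrete, the following is a minimal sketch of RANSAC ground plane fitting. It is not taken from any of the reviewed works; the function name `ransac_plane`, the distance threshold, and the iteration count are assumptions chosen for illustration. MSAC follows the same sampling loop but scores each candidate with a truncated residual cost instead of a plain inlier count.

```python
import numpy as np

def ransac_plane(points, threshold=0.1, iterations=200, seed=0):
    """Fit a plane n.x + d = 0 with RANSAC; return (normal, d, inlier_mask)."""
    points = np.asarray(points, dtype=float)
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(points), dtype=bool)
    best_model = None
    for _ in range(iterations):
        # Minimal sample: three distinct points define a candidate plane.
        sample = points[rng.choice(len(points), size=3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:                      # collinear sample, no unique plane
            continue
        normal = normal / norm
        d = -normal @ sample[0]
        # Consensus set: points closer to the plane than the threshold.
        mask = np.abs(points @ normal + d) < threshold
        if mask.sum() > best_mask.sum():
            best_mask, best_model = mask, (normal, d)
    if best_model is None:
        raise ValueError("no non-degenerate sample found")
    return best_model[0], best_model[1], best_mask
```

Removing the points flagged by the returned mask leaves the off-ground points, which is the reduced input that the later detection and segmentation stages work on.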


Table 3. Urban management applications from mixed dynamic and static urban objects 


| Reference | Acquisition System | Classes | Data Processing | Detection and Segmentation | Classification | Accuracy |
|---|---|---|---|---|---|---|
| [28] | NAVTEQ True | Sky, building, road, tree, car, sidewalk, sign-S, fence, pedestrian, and water | 3D-2D projection between patches and super pixels (SPs) | 3D features: height above camera, surface planarity, and reflectance strength | Boosted decision trees, each 3D feature vector with a semantic label | 88% |
| [70] | Stereopolis I | 3D buildings, 3D roads, and a set of 3D visual landmarks | Not specified | Bottom-up algorithm given the objects' geometric specifications | Rigid stereo pairs are used for 3D recognition and modeling | Qualitative |
| [66] | Velodyne HDL-64E, a monocular camera, and other sensors | Ground (road), building, water, tree, grass, bush, pavement, sky, and obstacles (vehicles, pedestrian) | Separate the ground points using a RANSAC plane fitting algorithm; candidate obstacles are localized into a 3D cubic voxel grid | Bottom-up classification of local image patches and top-down contextual analysis to further resolve uncertainties | Fusing result of Velodyne data and image using the fuzzy logic inference framework, and smooth result by the Markov random field based temporal fusion method | F-measure MRF: 0.25-0.59, MFV: 0.24-0.53 |
| [74] | Stereopolis II | Car, pedestrian, noisy structure, dog, house facade, chimney, trees, and lampposts | 3D point cloud projected to elevation images, segmented facades as the highest vertical structures, and eliminated small and isolated regions | Connected objects are segmented using a watershed approach | SVM with geometrical and contextual features | det. = 98%, seg. = 78%, class. = 82% |
| [42] | RIEGL VMX-450 and SSW | Buildings, streetlamps, trees, telegraph poles, traffic signs, cars | Hough transform to filter the facade and Top-Hat for hole filling on the ground | Multi-scale supervoxels using the point attributes (colors, intensities) | Semantic knowledge | 91% |
| [71] | FGI Sensei | Trees, lamp posts, traffic signs, cars, pedestrians, and hoardings | PCA and connected components to remove ground and buildings | Local descriptor images (LDH), spin images, and general features | SVM and its C-SVM version with the radial basis function (RBF) kernel | 87.9% |
| [67] | NAVTEQ True and Stereopolis II | Tree, car, sign, person, fence, ground, and building | Ground and building segmentation by RANSAC, maximal height filter, and morphological operations | Voxel-based segmentation and 3D feature extraction (intensity, areas, and normal angle) | Boosted decision tree | 91% |
| [68] | NAVTEQ True and Velodyne HDL32 | Building, tree, car, traffic sign, pedestrian, road, water, and sky | Rule-based detectors for road surfaces and building facades, RANSAC | Super-voxel features, surface orientation by PCA, and 2D semantic segmentation | Boosted decision tree detector | 83% and 67% |
| [75] | Velodyne HDL-64 | Vehicles, pedestrians, short facades, and street clutter | 2D grid-based approach via point height information: ground, low foreground, high foreground, and sparse areas | Fast connected component analysis for object separation, maximal elevation value, and point cloud density | Theano: CNN-based feature learning framework | 89% overall F-rate |
| [72] | Optech Lynx | Traffic light (Type 1), traffic light (Type 2), street lamp, tree, and other pole-like objects | Pedestrian crossings extraction by intensity data, vertical mean and variance computation, and a region growing algorithm for ground and non-ground elements | Pole-like object segmentation by Euclidean clustering and a geometric supervised classification | PCA to project points into a 2D raster grid binary image; Type 1: classification by a cubic SVM; Type 2: a two-layer feed-forward neural network with sigmoid hidden and softmax output neurons | F-score 95% |
| [73] | RIEGL VMX-450 | HDRObject9, SMDObject6, SUObject14 including bus station, light-pole, road sign, station sign, traffic light, traffic sign, trashcan, trees, vehicles, pedestrian, etc. | Not specified | Three discriminative low-level 3D shape descriptors for obtaining multi-view 2D representation of 3D point clouds | JointNet, by jointing low-level features and CNNs for 3D object recognition | Recognition: 94.6%, 93.1%, and 74.9% |
| [69] | Optech Lynx | Bench, car, lamppost, motorbike, pedestrian, traffic light, traffic sign, tree, waste-container, wastebasket | Planes elimination, MSAC | Connected component | ...e and colour to CNN object classification | 99.5% |
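
The F-measure and F-score entries in the Accuracy column of Table 3 are derived from a confusion matrix. The sketch below shows one common way to compute per-class precision, recall, and F1 from such a matrix; the function name `per_class_scores` and the row/column convention (rows are true classes, columns are predicted classes) are assumptions for the example, and individual works may follow a different convention.

```python
import numpy as np

def per_class_scores(confusion):
    """Precision, recall, and F1 per class from a confusion matrix.

    confusion[i, j] counts samples of true class i predicted as class j.
    """
    confusion = np.asarray(confusion, dtype=float)
    tp = np.diag(confusion).copy()
    predicted = confusion.sum(axis=0)   # column sums: all samples predicted as class j
    actual = confusion.sum(axis=1)      # row sums: all samples that truly are class i
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    denom = precision + recall
    # Harmonic mean of precision and recall; defined as 0 when both are 0.
    f1 = np.divide(2 * precision * recall, denom, out=np.zeros_like(tp), where=denom > 0)
    return precision, recall, f1
```

For a two-class matrix `[[8, 2], [1, 9]]`, class 0 has precision 8/9 and recall 8/10, which gives an F1 of 16/19.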


4.2. Urban management applications from mixed static urban objects


Table 4 introduces a compilation of works dealing with static urban object detection and classification.
Static urban object detection supports multiple purposes, such as hazard management, city planning, travel
guidance, 3D reconstruction for exploration systems, and geographical information system (GIS) development.
In turn, GISs allow for spatial data analysis, automatic change detection registration, geo-database updating
in urban environments, and smart city management.

| Reference | Acquisition System | Classes | Data Processing | Detection and Segmentation | Classification | Accuracy |
|---|---|---|---|---|---|---|
| [23] | VIAMETRIS MMS | Road boundaries, road markings, and traffic signs | RANSAC to extract road borders and centers | 3D NURBS for road curvature, road signs extraction by threshold on reflectivity value | Qualitative | Qualitative |
| [84] | LiDAR system and ground mobile truck data (not specified) | Tree | Point cloud filtered by the second order derivatives | Watershed transformation | Point-features-based matching algorithm in stereo vision, Förstner operator integrated | Comparison 80%-90%, 75%-95% |
| [21] | FGI Sensei | Pole-like object and trees | Not specified | Clustering the extracted vertical line segments and analysis of the spatial distribution | Cluster distribution inspection vertically split of a tree | comp. 90%, correc. 86% |
| [26] | SICK-VBLS | Lamps | Density of Projected Points (DoPP) | Maximum height of each grid cell | Height threshold | Qualitative |
| [22] | FGI Sensei | Tree | Not specified | Hyperspectral imagery and laser overlapping information, height distributions | SVM | Two species 95.8%, three species 83.5% |
| [27] | Velodyne HDL-64E and Ladybug | Buildings | Visibility mask for spherical projection, panorama overlapping perspective central images (PCIs), perspective frontal image (PFI), and histogram equalization | Upright feature keypoints | Query image vocabulary tree trained on SIFT descriptors, and geometric verification for PCIs (RANSAC with a 2D affine model) and PFI (3 degree-of-freedom (DOF) scale and offset check) | recall = 95% |
| [32] | VISIMIND MMS | Building facades | GPS and IMU data is integrated in post processing module (PPM) within the Kalman filter | Feature extraction based on Harris operator, and use of Lucas Kanade Feature Tracker | 28 control points measured by total station with accuracy of 5 mm | Total RMS value 0.011 m |
| [78] | Optech Lynx Mobile Mapper | Trees and buildings | Divides the mobile lidar points into grids on the xOY and feature extraction | Weights of points, geo-referenced feature image, image segmentation | Shape constraint and Z-direction profile analysis | Trees profile 97.9%, building shape 100% |
| [41] | SSW | Trees | Sloping adaptive neighborhood to remove the ground | Binary connected component labeling and crown segmentation of overlapping objects | Entity hierarchical extraction by density based height of the target | Qualitative (10 cm error) |
| [36] | Optech Lynx Mobile Mapper | Luminaires | Point cloud pre-segmentation based on height values | Colored point cloud filtered by geometric and radiometric features | Thresholds are extracted from the color histogram analysis (RGB) | 100% in 95 s |
| [76] | RIEGL VMX 250 | Parking railing, boards, light pole, house segment, hedge, fence, fire hydrant, warning tape, crane, chair, and bucket | Point cloud images co-registration, terrestrial images free-network bundle adjustment, control points manually collected from feature points, and block window patches as a basic z-buffer unit to filter | 2D Delaunay triangulation performed on the discrete points, multi-stereo image inter-correlation, and graph cut area based on the superpixel optimization | Evaluation of detection | detec. 64%, 76.4%, 80% |
| [43] | RIEGL VMX 250 | Traffic signs, road markings, and general pole-shaped objects (e.g. city lights or trees) | Reflectivity filtering | Euclidean cluster extraction | Segments processing by rules, point number, centroid, and height | 93% |
| [83] | RIEGL VMX-450 | Tree | Voxel-based upward-growing filtering | Euclidean distance clustering and voxel-based normalized cut segmentation | Waveform representation, two-layer deep Boltzmann machines (DBMs), and SVM | 86.1% |
| [56] | StreetMapper 360 | Power lines | 2D gridding and horizontal segmentation based filtering of horizontal segment | 2D point density based refinement to remove trees and building, Hough transform detection | Quantitative accuracy assessment | correctness 98.84%, complet. 90.84% |
| [77] | Velodyne HDL-32E | Building and road marking | Elimination of reflected parts of sidewalks or nearby vehicles through binarization | 2D probabilistic occupancy grid map for vertical structures, line extraction using Hough transform and the IEPF algorithm | Qualitative, RMS position errors for the lateral and longitudinal directions | RMS: 0.136 m lateral and 0.223 m longitudinal |
| [79] | LiDAR sensor (not specified) and 360° camera | Doors and windows | Cube mapping of spherical panorama images | Pixel-wise mask and semantic segmentation on point clouds through DBSCAN and connected components | No classification | Visual evaluation 1 to 2 cm |
| [100] | Optech Lynx Mobile Mapper | Road vegetation | Limit the edge of the road, increased point density | Polygonal region segmentation within 10 m threshold | MSAC | Mean geometric error 0.25 m, 2.82% |
| [80] | TOPCON MLS and a Ladybug-5 camera | Building facades | Ground filtering, edge center, and window detection | Geometric filtering | Kalman filtering | RMS X 0.131, RMS Y 0.135 |
| [81] | Road-borne MLS system (n/s) and a Ladybug 5 camera | Fences | Images cropped | FCN predictions, image to point cloud transfer using perspective projection, and polygonal segmentation using Hough transform | PointNet | Precision 95%, Recall 87%, K 0.89 |
| [82] | RIEGL VMX-450 | Building patches | Constraining the area of keypoint through the location within a 50 m radius from the image GPS position | Siam2D3D-Net: STN module and modified VGG network to learn the image feature, point cloud branch with PointNet | No classification | Qualitative |
| [40] | Optech Lynx | Trees | Left-side and right-side point cloud division, low-height points filtered | Foliage extraction by Voronoi tessellation and active contour | Characteristics measurements: trunk and foliage height and diameter | Average error in height extraction: less than 15 cm; TDBH average error: less than 10 cm |
| [65] | MODISSA | Buildings | Intrinsic calibration of TIR images. Take of horizontal lowest plane and CC for the point clouds | Harris corner detector and the line segment detector (LSD), Harris 3D | Find correspondences and 2D/3D registration by an automatic method based on restricted RANSAC | False alarm rate = miss detected lines over detected lines, less than 30% after 5 pixels |

*n/s: not specified

Developments extracting static urban objects process the data depending on the classes to be detected.
For other urban object detection procedures, it is first necessary to identify the ground and building points. Qin
and Gruen [76] relied on a z-buffer unit to filter points and segment three classes of static urban objects: parking
railing, boards, and light poles. The authors used 2D Delaunay triangulation and supervoxelization for object
detection. From a similar perspective, the works of [23], [77] used reflectivity values of road markings and
traffic signs to segment and filter buildings. Both building extraction and monument extraction are highly useful
in cultural heritage documentation, reverse engineering, and 3D object reconstruction. They also help generate
digital elevation models (DEM). Prior to object detection and segmentation, Chen et al. [27] and Gajdamowicz
et al. [32] performed a data integration process to allow for more detail in point clouds. Integrating point
clouds with imagery allows using 2D methods such as SIFT descriptors and the Harris corner detector. Yang
et al. [78] focused on the task of building reconstruction by relying on tree detection and extraction to solve
the occlusion problem. In this sense, MLS systems are possibly the best systems for detecting changes in
objects without physical contact. Building detection can also support urban flood disaster and risk management
plans [79]. Also, Ergun et al. [80] developed a building facade survey through a 2D Kalman filtering algorithm
and a related laser data segmentation method. The method allows the acquisition of rectified images from the
facade to CAD. A related work [81] presents an image-based point cloud segmentation
(IBPCS) method for filtering point clouds after semantic segmentation of images. The authors apply the method to
low-density point clouds for fence recognition. The detection and classification process relies on a fully
convolutional network (FCN) and PointNet. Thermal infrared (TI) images are used in [65] to evaluate the energy
consumption and leakage of buildings. The authors fuse thermal infrared image sequences with the point clouds.
The fusion describes the radiance of the building facade and helps to analyze thermal properties. For
augmented reality, Liu et al. [82] used 2D image patches and 3D LiDAR point clouds from the urban scene. They
proposed Siam2D3D-Net to achieve high-quality virtual-real registration using a 2D-3D patch-volume dataset
and retrieval in Euclidean space.
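
As an illustration of the 2D methods that become available once point clouds are integrated with imagery, the sketch below computes the Harris corner response of a grayscale image with NumPy only. It is a simplified textbook variant (central-difference gradients and a box window instead of a Gaussian one), not the implementation used in [27] or [32]; the function names and the constant `k = 0.05` are assumptions for the example.

```python
import numpy as np

def box_sum(a, radius=1):
    """Sum over a (2r+1) x (2r+1) window around each pixel, with zero padding."""
    padded = np.pad(a, radius)
    out = np.zeros(a.shape, dtype=float)
    size = 2 * radius + 1
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
    return out

def harris_response(image, k=0.05, radius=1):
    """Harris response R = det(M) - k * trace(M)^2 for every pixel.

    M is the structure tensor of the image gradients summed over a local window.
    R is large and positive at corners, negative along edges, near zero in flat areas.
    """
    iy, ix = np.gradient(np.asarray(image, dtype=float))   # row and column gradients
    sxx = box_sum(ix * ix, radius)
    syy = box_sum(iy * iy, radius)
    sxy = box_sum(ix * iy, radius)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2
```

Thresholding the response and keeping local maxima yields corner keypoints that can then be matched between images or against projected point cloud features.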

According to researchers [21], [41], [83], classification of tree species is mainly performed for safety
studies, noise modeling, and environmental and ecological analyses. Trees play a critical role in urban
ecosystems since they help maintain the environmental quality and aesthetic beauty of urban landscapes. Moreover,
trees provide a social service for communities. Tree classification relies on tree height characteristics and
demonstrates that biomass changes can be mapped with relative ease using repeated laser collections of the same
tree. Gong et al. [84] collected tree data such as counts and dimensions for effective tree management and
quantitative tree analysis in urban areas. Likewise, Zhong et al. [22] introduced hyperspectral sensors to classify
tree species, mainly coniferous and deciduous trees. The first step in tree detection using MLS systems and imagery
data is ground point filtering, followed by shape detection considering pole-like shape, height, and crown
density. Safaie et al. [40] created a tree inventory from an MMS point cloud. They first extract the trunk
with the Hough transform (HT) and then the tree foliage via Voronoi tessellation (VT). Then, a density image
is created using the number of points as the gray value of each pixel.
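
The density image mentioned above can be sketched as a 2D histogram of the point cloud. The function name `density_image`, the cell size, and the scaling of counts to 8-bit gray values are assumptions for illustration and are not taken from [40].

```python
import numpy as np

def density_image(points, cell=0.1):
    """Count points per XY cell and scale the counts to 8-bit gray values.

    Returns (counts, gray): the raw per-cell counts and the gray-level image.
    """
    xy = np.asarray(points, dtype=float)[:, :2]
    origin = xy.min(axis=0)
    idx = np.floor((xy - origin) / cell).astype(int)
    counts = np.zeros(tuple(idx.max(axis=0) + 1), dtype=int)
    # np.add.at accumulates correctly when several points share a cell.
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1)
    gray = np.round(255 * counts / counts.max()).astype(np.uint8)
    return counts, gray
```

Cells crossed by a trunk collect many returns along the vertical direction and therefore appear as bright pixels, which is what makes such an image convenient for circle or line detection.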

Detecting urban objects such as street lamps, traffic signs, street light poles, and power lines starts by
detecting pole-like objects, as in tree detection. The authors of [26], [36], [43], [56] based their
urban object detection approaches on point cloud density, height filters, and geometric and radiometric features.
The RGB spectrum is a helpful tool for object differentiation through color filters and thresholds on color
histograms.


4.3. Road sign detection and classification 

Table 5 lists relevant works that deal with or propose methods for traffic sign detection and classification.
Traffic signs worldwide share standard features but also respond to local classifications. All strictly static
urban objects have particular physical characteristics so that they remain safe and do not obstruct other
activities. In particular, traffic signs follow specific standards so that drivers and pedestrians can identify
them easily and quickly. Their spatial characteristics allow us to differentiate elevated traffic signs from
lower traffic signs; geometric shapes make it easy to know whether a given sign is informative, restrictive, or
preventive. Also, visual features such as color and texture allow us to know the message being transmitted.
Moreover, road signs are coated with reflective paint; hence, they reflect back the light from car headlights,
which improves readability. This feature is
advantageous when using laser sensors for road sign detection and classification, since sensor-based systems
can retrieve color intensity data.

| Reference | Acquisition System | Data Processing | Detection and Segmentation | Classification | Accuracy |
|---|---|---|---|---|---|
| [101] | Velodyne LiDAR sensor system and associated photolog | Many sign attributes, such as location, size (length and width), mount height, collection date, and facing direction were measured | Color, sheeting type, age, and geographic conditions, in order to predict their retro-reflective degradation over time | Visual nighttime inspection and retro-reflectivity measurement | Conditional probability, qualitative |
| [44] | RIEGL VMX-450 | Surface extraction from point clouds by reflectance and geometric characteristic | Geo-referenced relations according to the normal of ground and four image features: intensity, color histogram, edge contrast, and proportion of traffic sign | Evaluation of the visibility level based on a combination of visual appearance and spatial-related features | Average deviation under 5% |
| [37] | Optech Lynx Mobile Mapper | Horizontal intensity images, thresholding, and evaluation of histograms with GMM | DBSCAN and subsequent curvature analysis with PCA | Shape descriptors, 3-D points intensity image | completeness 92.11%, correctness 93.96% |
| [38] | Optech Lynx Mobile Mapper | GMM with two components: non-reflective and reflective points | DBSCAN, PCA | Geometric parameters: centroid, height, etc.; and color: (HLS) bitmap using HOG and SVM | 98% |
| [46] | RIEGL VMX-450 | Highly retro-reflective vertical plane | Centroid coordinates of the bottom ring, a horizontal profile with a thickness size | SVM classifier trained by a mix feature of HOG and color descriptor | detection 91.63%, 92.61%, precis. 96.32% |
| [45] | RIEGL VMX-450 | Voxel cloud connectivity segmentation (VCCS) | Bag-of-visual-phrases representations | Gaussian-Bernoulli deep Boltzmann machine-based hierarchical classifier | 97.54% |
| [39] | Optech Lynx Mobile Mapper | K-Nearest-Neighbor algorithm is used to obtain the closest voxel | Geometric and radiometric features, 3D data projected on 2D RGB images | | 99.71% |
| [47] | RIEGL VMX-450 | Not specified | Fast R-CNN training for detection and coarse corresponding relationship between the image and the point clouds | Traffic sign damage inspection: tilted pole, deformed board, fallen board or pole, and disappeared | detec. = 92.42% |
| [59] | Velodyne HDL-64E S2 and Basler digital camera | Driving cuboid aligned along roadway boundaries and colorized laser scans | Laser reflectivity and RGB, HSV, and CIE L*a*b* color spaces, SVM | 3D geometric characteristics of planar objects, HOG | detection 95.87%, recognition 95.07% |
| [61] | AnnieWay (KITTI database) | K-means and background subtraction | Neural network with binary output board detection, "object signature", the 3D-CSD (3D-Contour Sample Distances) descriptor | Deep learning CNN to classify type: maximum speed, stop, preferential, pedestrian, or other | classi. = 97.64%, detec. = 76% |
| [48] | RIEGL VMX-450, four CCD cameras, a set of Applanix POS LV 520 | Knowledge of pole height and road width to remove points | Detect traffic sign interest regions based on intensity information and geometrical structures | On digital images, supervised GB-DBM classification | detection 86.8%, 93.3% |
| [54] | Velodyne LiDAR sensor system and associated photolog | Not specified | Sign location, size, color, condition | Random Forests model and odds ratio | Qualitative sign inspection by contingency tables |
| [62] | Velodyne VLP-16 | Depth filtration | Bounding box and sign location on an image to find a corresponding to the sign point cloud | Faster-RCNN architecture | From 30% to 96% |
| [86] | RIEGL VMX-450 | Objects on both sides of the lane based on the trajectory data and in a distance d are retained | Semantic and spatial properties (location, position, and geometric features) | Deep neural network: YOLOv3 and FCN model | mx-pres. 95.8%, mx-recll 99.25%, mx-f1-m 95.77%, mx-qual 91.89% |
| [102] | RIEGL VMX-450 | Quantitative representation of the visibility and recognizability of traffic signs within Sight Distance (SD) | Visual recognizability field and Traffic Sign Visual Recognizability Evaluation Model (TSVREM) | Parameter sensitivity analysis | occlusion rate: 95%, verification 94.89% |
| [87] | RIEGL VMX-450 | Not specified | Pole height, road width, intensity, geometrical structure, and plate size from LiDAR data | Points are projected onto the images to apply a Convolutional Capsule Network | recognition rate 0.957, detec. 86.8% |
| [88] | RIEGL VMX-450 | A curb-based filtering method to divide mobile LiDAR data into ground and off-ground points | Euclidean clustering algorithm to extract pole-like objects and retro-reflectance properties | Convolutional Capsule Network | recognition rate 0.965, detec. 86.8% |
| [85] | AnnieWay (KITTI database) | Ground and building filtering regarding the driver trajectory | Object segmentation through 3D point cloud density and retro-reflective material feature. Color segmentation using HSV model | Classification through geometric shape association, local features description for semantic data | precision 0.88 |
| [49] | RIEGL VMX-450 | Curb-based filtering method | Euclidean clustering algorithm to extract pole-like objects, next retro-reflectance properties to extract traffic signs, and image re-projection | Deep learning: convolutional capsule network | Recognition: 0.965 |


For traffic sign detection, developments take advantage of the fact that signs are highly retro-reflective vertical planes. After ground point segmentation, laser point intensity values help segment traffic signs from other objects. Researchers [37], [38] used a Gaussian mixture model (GMM) and a density-based approach (DBSCAN) to filter traffic signs. The GMM is a probabilistic model that can be thought of as generalizing k-means clustering to incorporate information on the covariance structure of the data as well as the centers of the Gaussian distributions. Traffic sign classification depends on the number of details or characteristics to be identified in the signs. Works such as those in [38], [46], [59] relied on the histogram of oriented gradients (HOG) to classify road signs. HOG is a feature descriptor based on the distribution of gradient directions. Image gradients are useful because the gradient magnitude is large around edges and corners. The same works [38], [46], [59] used a support vector machine (SVM) to classify the HOG features. Machine learning simplifies the traffic sign classification task by learning from ground truth. Tan et al. [45] and Yang et al. [48] used a GDBM-based hierarchical classifier. The GDBM uses Gaussian units in the visible layer of the deep Boltzmann machine (DBM); the DBM, in turn, is an ANN model in which each neuron in the intermediate layers receives both top-down and bottom-up signals, thus facilitating uncertainty propagation during the inference procedure. Yang et al. [85] used the KITTI database to segment road signs. The object segmentation is divided into 3D point cloud segmentation and image segmentation. The former relies on 3D point cloud density and the retro-reflective material feature; the latter is a color segmentation using the HSV model. The preliminary classification is performed through geometric shape association and local feature extraction and description for semantic data such as numbers, characters, and drawings.
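As an illustration of the intensity filtering and density-based clustering step described above, the following minimal sketch keeps only high-intensity off-ground points and groups them with a brute-force DBSCAN. It is not the implementation used in [37], [38]; the function names, the normalized intensity range (0 to 1), and the parameter values (`min_intensity`, `eps`, `min_pts`) are assumptions chosen for the example.

```python
import math


def dbscan(points, eps, min_pts):
    """Label each point with a cluster id (0, 1, ...) or -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force radius search; the result includes point i itself.
        return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE          # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster    # border point: joins but does not expand
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            reachable = neighbors(j)
            if len(reachable) >= min_pts:
                queue.extend(reachable)  # core point: keep growing the cluster
    return labels


def sign_candidates(cloud, min_intensity=0.8, eps=0.3, min_pts=4):
    """cloud: list of (x, y, z, intensity) off-ground points, intensity in [0, 1]."""
    bright = [(x, y, z) for x, y, z, inten in cloud if inten >= min_intensity]
    clusters = {}
    for point, label in zip(bright, dbscan(bright, eps, min_pts)):
        if label != -1:
            clusters.setdefault(label, []).append(point)
    return list(clusters.values())


# Synthetic data: a 3x3 retro-reflective plate plus unrelated points.
plate = [(5.0, 0.1 * a, 2.0 + 0.1 * b, 0.95) for a in range(3) for b in range(3)]
clutter = [(20.0, 0.0, 1.0, 0.9), (6.0, 1.0, 1.0, 0.2), (5.0, 0.1, 2.1, 0.1)]
candidates = sign_candidates(plate + clutter)
print(len(candidates), len(candidates[0]))
```

In this toy case the nine plate points form a single candidate, while the isolated bright point is rejected as noise and the low-intensity points never reach the clustering stage. A production system would replace the brute-force neighbor search with a spatial index.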
In this sense, semantic detection has simplified the traffic sign detection task. 3D information is re-projected onto 2D images, and D-ANNs are used for sign classification, as discussed in [39], [47], [61], [62], [86]-[88]. These systems use CNNs that are organized hierarchically, trained by supervised learning, and contain several specialized hidden layers. This organization allows the first layers to detect lines and curves, with specialization increasing until the deeper layers recognize complex shapes, such as faces or animal silhouettes. Guan et al. [49] developed an automatic traffic sign detection and recognition method. They employed Euclidean clustering to extract pole-like objects and used retro-reflectance properties to extract traffic signs. Classification was performed using a convolutional capsule network.
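The re-projection of 3D information onto 2D images mentioned above can be sketched with a simple pinhole camera model. This is only an illustrative example: it assumes the LiDAR points have already been transformed into the camera frame (z pointing forward), ignores lens distortion, and uses hypothetical intrinsic parameters (`fx`, `fy`, `cx`, `cy`) and function names that do not come from the reviewed works.

```python
import math


def project_to_image(points_cam, fx, fy, cx, cy, width, height):
    """Pinhole projection of 3D points given in the camera frame (z forward).

    Returns one (u, v) pixel per point, or None when the point lies behind
    the camera or falls outside the image.
    """
    pixels = []
    for x, y, z in points_cam:
        if z <= 0:
            pixels.append(None)
            continue
        u = fx * x / z + cx
        v = fy * y / z + cy
        inside = 0 <= u < width and 0 <= v < height
        pixels.append((u, v) if inside else None)
    return pixels


def bounding_box(pixels, width, height, margin=2):
    """Pixel box (u_min, v_min, u_max, v_max) around the visible projections."""
    visible = [p for p in pixels if p is not None]
    if not visible:
        return None
    us = [u for u, _ in visible]
    vs = [v for _, v in visible]
    return (max(0, math.floor(min(us)) - margin),
            max(0, math.floor(min(vs)) - margin),
            min(width - 1, math.ceil(max(us)) + margin),
            min(height - 1, math.ceil(max(vs)) + margin))


# Hypothetical sign cluster and camera intrinsics for a 1280x720 image.
cluster = [(0.0, 0.0, 10.0), (1.0, -0.5, 10.0), (0.0, 0.0, -5.0), (100.0, 0.0, 10.0)]
pixels = project_to_image(cluster, fx=1000.0, fy=1000.0, cx=640.0, cy=360.0,
                          width=1280, height=720)
box = bounding_box(pixels, 1280, 720)
print(pixels)
print(box)
```

The resulting box delimits the image patch that would be cropped and passed to the image-based classifier; points behind the camera or outside the frame are discarded rather than clamped.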


Table 6. Road element detection and classification 


Reference | Acquisition System | Classes | Data Processing | Detection and Segmentation | Classification | Accuracy
[34] | IP-S2 HD | Lanes, arrow signs, and crosswalks | Point cloud geographical re-projection, geometric filtering, interpolation on a regular grid (raster 2D image) | Labeling object isolation and morphological indicator, peak detector for lane detection, and attributes generation | Template matching for arrow marks, and crosswalks detection by morphological indicators | qualitative
[92] | RIEGL VMX-450 | Longitudinal markings, transverse markings, object markings, and special markings | Vehicle trajectory raw data partitioned, and small height pseudo scan-lines for jumps detection in road curbs | Geo-referenced intensity image via an extended inverse-distance-weighted, and local and global intensity data | Image segmentation by a point-density-dependent multi threshold method to recognize road markings | complet. 0.96, correct. 0.83, F-measure 0.89
[96] | RIEGL VMX-450 | Road manhole and sewer well covers | Rasterization of road surface points into a 2-D georeferenced intensity image improving the inverse distance weighted | Marked point of disks and rectangles to model the locations of manhole and sewer well covers and their geometric dimensions | Reversible jump Markov chain Monte Carlo algorithm, simulation of the posterior distribution using a Bayesian paradigm | complet. (95.16%), correct. (97.25%), and quality (92.67%)
[25] | eXperimental Platform (XP-1) | Continuous line, broken line, words, zig-zag, hatch, and arrow | Range dependent thresholding to the intensity values, and range attributes convert into 2D raster surfaces | Binary morphological operations and priori knowledge of the dimensions | Comparative analysis by the length and average width of the final extracted road markings | detec. 90.91% and 88.43%
[93] | RIEGL VMX-450 | Road markings | Partition the point cloud into blocks along the trajectory, a profile is generated perpendicularly, curb points are located within each profile | Multi-segment thresholding strategy using the Otsu's, Euclidean distance clustering, voxel-based normalized cut segmentation | Large-size marks based on trajectory and curb-lines; deep learning-based small-size marks; and PCA rectangular-shaped marking classification | complet. 0.93, correct. 0.92, F-measure 0.93
[97] | Stereopolis II | Road markings | Orthophoto-like LiDAR image generation by vertically projected point cloud onto a horizontal plane (intensity and height) | Reversible-Jump Markov Chain Monte Carlo (RJMCMC) sampler coupled to detect occurrences of road marking | Local bundle adjustment (LBA), and uncertainty propagation applied to estimate pose parameters and covariance | Maximum error 0.4 m
[51] | Velodyne HDL-32E | Sidewalk, median, guard rail, fencing, lighting, landscape areas, delineators, lanes, road markings, and road signs/boards | Road centerline and road width are extracted from the TIGER (Topologically Integrated Geographic Encoding and Referencing) dataset | RANSAC to ground points, poles and lamp posts; road marking and road signs by edges from the images, and point cloud intensity; building by planar property | Attributes: dimensions, curbs, condition, geometry, message, and type | qualitative
[35] | IP-S3 HD1 | Road markings | Point cloud data tiles cropped, 3D point grey values calculation of the corresponding pixels in raster image, and nadir aerial ortho image projection | Harris corner detector, adaptive approach for dynamic threshold computation, and Learned Arrangements of Three Patch Codes (LATCH) | k-NN based descriptor matching and Homography (computed with RANSAC) | qualitative by evaluation of the matching result
[90] | Optech Lynx Mobile Mapper | Pedestrian crossings and arrows | Curb pavement segmentation by k-means and intensity filter within GMM | Raster image creation and road markings detect by Otsu binarization and CC | Set of binary image features: GBF | F-scores: ex- [illegible], classif. 96%
[91] | Optech Lynx Mobile Mapper | Sidewalk, pavement, and road markings | Intensity values normalization, saliency analysis to segment z-axis point clouds and k-means, height filter within a 2D raster image | Ground extraction via region growing, road curb extraction by ground limits, and fusing heuristic and supervised learning methods | Reflective materials, standard deviation filter on the pavement intensity image in a 3-by-3 pixel neighborhood | F-score detec. 95% (pavement/sidewalks), detec. 80% (road marking)
[98] | Five 3D LiDAR sensors, and a multi-camera network | Curbs | ERFNet semantic segmentation, color projection of the semantic image pixel | Semantic labels analysis | Curb's lower and upper edge points searching, monotonically ascending region and vertical structure | precision 80%, recall 60%
[99] | RIEGL VMX-450 | Dashed line, text, straight arrow, turn arrow, diamond, triangle, lane line, and crossing | 3D point clouds are first projected onto a horizontal plane and gridded as a 2D image | Segmentation network U-net to classify every pixel | A multiscale (distance-based Euclidean) clustering algorithm to large size road markings, and a CNN classifier to small size road markings | U-net-based Precision 95.97%, Recall 87.52%, F1-score 91.55%
[94] | RIEGL VMX-450 | Curb-based road surface extraction and multi threshold road marking extraction | Vehicle's trajectory data, MLS point clouds partition into a sequence of data blocks, corresponding profile sectioned with a certain width | Inverse distance weighting method, intensity and local-global elevation data, MLS road surface interpolation and Otsu for 2D intensity images | Sparse and unorganized road marking points clustered into topological and semantic objects using the conditional Euclidean clustering | recall 90.79%, precision 92.94%, and F1-score 91.85%
[95] | SSW | Manhole covers | Intensity-based images generation, fluctuation trend in the elevation to ground points extraction | HOG descriptor, PCA, symmetry characteristic, and shape detection, graph-based image segmentation method, OneCut | Dimension evaluation by sector decomposition diagram for manhole maintenance analysis | 96.18%, complet. 94.27%, F-measure 95.22%
[89] | RIEGL VMX-450, AnnieWAY | Road boundary | The erroneous boundary removal is treated as a binary classification through a U-Net model | CNN-based method for 2D boundary completion, and Euclidean distance to partition the boundary points into separated line clusters | Matching taxi GPS trajectory points via boundary line images with centerline | 91.34-92.14 complet., 89.87-95.91 correct., 82.81-88.65 quality
[103] | Five LiDAR and four monocular cameras | Vegetation, road, curb, lane marking, terrain, and sidewalk | Semantically labeled images and projecting each LiDAR point onto the images | Extraction through semantic classes from region of consecutive points | Detection evaluation | avg distance 0.2-0.34 m, prec. 63-84.2
[104] | Velodyne VLP-16 | Road markings | Non road point filter based on the height difference | Road surface extraction by a moving fitting window filter from each pseudo-scan line | Marker edge detector with an intensity gradient and statistics histogram | Recall 90%, precision 95%, MCC 92%
[105] | Dataset 1: Optech SGI and Point Grey; Dataset 2: RIEGL VMX-450 | Road markings | Non-ground filtering and section alignment | A number of candidates are detected using HT algorithm | Fuzzy inference system | Average 88% F1-score and 87%
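The accuracy figures in Table 6 mix several related measures. As a reading aid, the sketch below assumes the definitions commonly used in road extraction evaluation (completeness as recall, correctness as precision, quality as the ratio of true positives to all detections plus misses); individual works may define them slightly differently, and the helper names are hypothetical. Under these assumptions, the values reported for [92], [96], and [99] are mutually consistent.

```python
def completeness(tp, fn):
    """Share of reference objects that were detected (recall)."""
    return tp / (tp + fn)


def correctness(tp, fp):
    """Share of detections that match a reference object (precision)."""
    return tp / (tp + fp)


def quality(tp, fp, fn):
    """True positives over all detections plus missed objects."""
    return tp / (tp + fp + fn)


def f_measure(cpt, crt):
    """Harmonic mean of completeness (recall) and correctness (precision)."""
    return 2 * cpt * crt / (cpt + crt)


def quality_from_rates(cpt, crt):
    """Quality expressed through completeness and correctness only."""
    return 1.0 / (1.0 / cpt + 1.0 / crt - 1.0)


# Consistency checks against values listed in Table 6.
q_96 = quality_from_rates(0.9516, 0.9725)   # manhole covers, reference [96]
f_92 = f_measure(0.96, 0.83)                # road markings, reference [92]
f_99 = f_measure(0.8752, 0.9597)            # U-net recall/precision, reference [99]
print(f"{q_96:.2%} {f_92:.2f} {f_99:.2%}")
```

The count-based and rate-based forms of quality agree; for example, 8 true positives, 2 false positives, and 2 misses give a quality of 8/12 either way.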


IAES Int J Rob & Autom, Vol. 11, No. 2, June 2022: 89-110 




4.4. Road element detection and classification 


Table 6 lists relevant works dealing with or proposing methods for road element detection and classification. According to Wen et al. [89], obtaining critical information on urban roads helps ensure traffic safety. Road elements comprise curbs, lines, and road markings, among others. Road inventory is essential for adequate transportation management, advanced driver-assistance systems (ADAS), road network maintenance, traffic analysis, and traffic inspection. The first step in road element detection is ground point filtering. Height filters can help extract the rest of the urban objects. Works searching for curbs and sidewalks start with a vertical 2D projection of point clouds. Then, they use intensity and height data to identify the elements. On the other hand, the task of searching for lines, arrows, pedestrian crossings, and dashed lines begins with a horizontal 2D projection of segmented ground point clouds. Intensity values also help detect these elements. Researchers in [90], [91] used k-means and intensity filters jointly to detect road elements. The first work detects pedestrian crossings and arrows using geometry-based feature (GBF) extraction and classification. In the second work, sidewalk and pavement were identified by segmenting points whose normal vectors were close to the z-axis. Generating 2D images allows urban elements of interest to be identified with 2D image methods; road markings are then represented as grayscale or binary images. Researchers [25], [34] relied on morphological methods to process the information and detect road markings. Similarly, in [92]-[94], researchers applied the Otsu thresholding method to extract urban elements of interest. Other 2D image detection methods include the Harris corner detector [35], principal component analysis (PCA) [93], HOG [95], and GBF extraction and classification [51], [90]. As reported in the literature, many methods can be used to assess detection accuracy, including Markov chain Monte Carlo (MCMC) methods, the Bayesian paradigm [96], [97], and CNNs [98], [99].
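The horizontal 2D projection and Otsu thresholding steps described above can be sketched as follows. The example is a simplified illustration, not the procedure of [92]-[94]: it assumes ground points have already been segmented, that intensities are scaled to integers in 0 to 255, and that a plain per-cell average replaces the inverse-distance-weighted interpolation used in some of the reviewed works. Function names are hypothetical.

```python
def rasterize_intensity(points, cell, nx, ny):
    """Average the intensity of ground points (x, y, intensity) per grid cell.

    Returns an ny-by-nx image of integers; empty cells are set to 0, which a
    real pipeline would mask out or interpolate instead.
    """
    sums = [[0.0] * nx for _ in range(ny)]
    counts = [[0] * nx for _ in range(ny)]
    for x, y, inten in points:
        col, row = int(x // cell), int(y // cell)
        if 0 <= col < nx and 0 <= row < ny:
            sums[row][col] += inten
            counts[row][col] += 1
    return [[round(s / c) if c else 0 for s, c in zip(srow, crow)]
            for srow, crow in zip(sums, counts)]


def otsu_threshold(image, levels=256):
    """Threshold t maximizing the between-class variance of {<= t} and {> t}."""
    hist = [0] * levels
    for row in image:
        for value in row:
            hist[value] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Synthetic 3x3 m road patch: dark asphalt (40-50) and bright paint (200-210).
truth = [[40, 50, 200], [45, 210, 205], [50, 40, 45]]
points = [(c + 0.5, r + 0.5, truth[r][c]) for r in range(3) for c in range(3)]
image = rasterize_intensity(points, cell=1.0, nx=3, ny=3)
threshold = otsu_threshold(image)
marking_mask = [[1 if v > threshold else 0 for v in row] for row in image]
print(threshold, marking_mask)
```

Cells above the threshold are kept as road marking candidates, which can then be cleaned with morphological operations or clustered before classification.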


5. DISCUSSION 

This review discusses the general applications of MLS systems merged with vision systems in urban management tasks. According to the review and Figure 2, RIEGL VMX-450, Optech Lynx Mobile Mapper, and Velodyne HDL-64 are the most popular MLS tools for managing urban tasks. RIEGL VMX-450 is a robust integrated system with its own sensors; Optech Lynx Mobile Mapper is less robust than the RIEGL system but includes both digital cameras and GPS receivers. In turn, the Velodyne HDL-64 is robust but comprises only a laser scanner sensor. The choice of a given MLS system over the others may therefore depend largely on multiple factors, such as cost. Integrated MLS systems are more appropriate, since the cameras capture photogrammetric information while the LiDAR extracts geometrical data and the GPS receivers retrieve the global location of objects. Most of the works dealing with or proposing alternatives for urban object classification share the following standardized methodological structure: i) they perform geo-referenced point cloud and imagery data acquisition; ii) they process the data on the ground, facades, and the remaining objects (i.e., data processing); iii) they propose a specific segmentation and detection method for the particular problem to be solved; and iv) finally, they propose a classification method and assess the accuracy of the classified results.

The trend in Figure 3 shows that 3D point cloud and imagery detection and classification problems regarding urban objects are growing at their own pace. The number of works on mixed dynamic and static urban objects and on mixed static urban objects apparently remains constant, being almost the same year by year. In contrast, since approximately 2015, works on road sign and road element detection and classification have been increasing.

Contemporary initiatives to accomplish urban tasks pursue specific goals with new proposals. For instance, building detection and reconstruction aim to recover occluded regions and texture. Hyperspectral cameras are useful in tree, building energy, and pedestrian detection. Traffic elements are labeled using highly retro-reflective paint for easy identification at night. Likewise, road elements (e.g., sidewalks, curbs, lines, pedestrian crossings) are detected and distinguished from one another by their intensity values. The literature also discusses the detection and classification of retro-reflective and non-retro-reflective elements. Finally, 3D point cloud and 2D texture projection allows experts to apply well-known image processing methods.

In the use of MLS systems and vision systems, a disadvantage persists concerning manual and offline algorithm training. Algorithms are hard to run in real time, and part of the object detection and classification process is performed manually or offline. Thus, the training work depends on human experience and knowledge, and algorithm robustness depends on lighting conditions, shadows, and the urban landscape. It also seems that, when searching for two or more urban object classes, avoiding the extraction of greater details improves accuracy; works detecting different objects hardly extract greater details. In this sense, semantic features seem to contribute to the solution, since they serve as the basic conceptual description of meaning for any element and contribute to having large labeled data sets.
Figure 3. Trend per year of the works organized in our tables: i) mixed dynamic and static urban objects, ii) mixed static urban objects, iii) road sign detection and classification, and iv) road element detection and classification


6. SUGGESTED FUTURE RESEARCH 

MLS systems that integrate GPS receivers and cameras allow simultaneous registration of visual and spatial data. Nevertheless, the high cost of these systems may be a disadvantage of the current solution. The developments explored in this review share some overall limitations and challenges. Two of the most common problems are the existence of large occluded regions and segmentation. Different authors propose learning and searching for occluded parts. They suggest that several scan images of the same area could mitigate this problem; however, segmentation methods are a subject of study in their own right. Other critical issues are the under-segmentation and over-segmentation of point clouds. Improving segmentation processes requires an analysis of object density, shape, color, and texture; some promising works add intensity gradient, spectral features, and geometric features. We also found that gray-scale and binary information are the most used features for object segmentation, followed by the RGB color space; however, other color spaces such as HSV or CMYK can improve the segmentation and classification of different objects by relying on a broader range of color shades.

MMS including LiDAR and cameras are applied to other urban tasks and can consequently benefit a range of industries and disciplines. This review provided an overview of the multiple industries benefitting from MMS developments for particular applications: i) automotive industries: vehicle statistics analysis, accident statistics analysis, electronic unit development for road assistance and vehicle driving; ii) urban planning and 3D reconstruction: road dimensions and directions, identification of sidewalk characteristics, building classification, bus stop detection, light detection, ramp detection, pedestrian crossing detection, and road sign identification, as well as management of transportation, commerce, recreational, and environmental projects, and study of touristic places; iii) construction industries: drainage system management, complementary works, signaling dynamics, weather forecast, and obstacle detection for road sign installation; iv) data collection companies: socio-demographic data collection, trade applications, construction, employment, home and housing projects, map development, environmental studies, population, transport, and tourism; and v) telecommunication and other digital technology companies: utility pole detection, street name sign detection, road sign detection, identification of regional borders, railroad detection, green area data collection, and socioeconomic data collection.

New urban management developments must consider government databases as data sources in order to meet their goals. Finally, some of the reviewed works discuss the feasibility of further fusing their systems with complementary sensors, such as radars, video cameras, and encoders, to name a few.

7. CONCLUSION


This review discussed the use of 3D point cloud data with imagery in urban management applications. The goal of the review was to highlight the MMS and MLS systems currently available for handling urban management tasks. Additionally, our review discussed current trends in urban object segmentation, detection, and classification. We considered urban management applications such as historic preservation, roadside assistance, road infrastructure inventory, and public space study. Urban element detection aims at maintaining order between dynamic and static variables. Dynamic object detection (e.g., pedestrians and cars) is mostly conducted for self-driving assistance, driver assistance, and traffic scene analysis. On the other hand, detection of road markings and road signs contributes to successful traffic inspection, road system maintenance, and driver safety. Urban elements are key guidance assets within the road system; hence, they have to be easily detectable by pedestrians and drivers. Recurrent inspections of road elements guarantee their visibility, maintenance, and ability to provide the necessary road information. Unfortunately, the costs of road infrastructure maintenance increase as roads deteriorate. From this perspective, MLS systems promise to be a feasible solution to prevent serious road deterioration through constant monitoring, although the advantages of both automatic and manual acquisition inspections still need to be demonstrated. The latest works on road sign detection revolve around using D-ANNs, implying that this path might be the best for urban element identification. Even though the results are favorable, there is still some uncertainty surrounding the use of such a powerful tool for a small problem. Choosing between deep learning and classical machine learning methods depends on several factors, such as the amount of data to be identified. Identifying some urban elements can be troublesome even when their variables are well known and well defined, because the elements may be occluded, altered, or damaged. In this case, it is advised to consider both D-ANNs and classical machine learning options.